Classification of Tweets via Clustering of Hashtags

Authors

  • DOLAN ANTENUCCI
  • GREGORY HANDY
  • AKSHAY MODI
  • MILLER TINKERHESS
Abstract

We present two techniques to aid the retrieval of information from Twitter. First, we present a technique to cluster hashtags into meaningful topic groups using a combination of co-occurrence frequency, graph clustering, and textual similarity. Second, we present a technique to classify a tweet in terms of these topic groups based on its word content, using a combination of PCA dimensionality reduction and a variety of multi-class classification algorithms. We examine the relationship between the clustering step and the classification step to evaluate the performance of each.
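The co-occurrence idea behind the first technique can be illustrated with a small sketch. This is not the authors' method: the abstract does not specify the graph clustering algorithm or how textual similarity is combined, so the sketch assumes the simplest possible stand-in, namely connected components of a co-occurrence graph whose edges are kept only above a count threshold. The function names, the `min_count` parameter, and the sample tweets are all hypothetical.

```python
import re
from collections import Counter
from itertools import combinations

# Assumption: a hashtag is '#' followed by word characters.
HASHTAG_RE = re.compile(r"#(\w+)")


def extract_hashtags(tweet):
    """Return the sorted, de-duplicated, lowercased hashtags of one tweet."""
    return sorted({tag.lower() for tag in HASHTAG_RE.findall(tweet)})


def cooccurrence_counts(tweets):
    """Count how often each unordered hashtag pair shares a tweet."""
    counts = Counter()
    for tweet in tweets:
        for pair in combinations(extract_hashtags(tweet), 2):
            counts[pair] += 1
    return counts


def cluster_hashtags(tweets, min_count=2):
    """Group hashtags by connected components of the thresholded graph.

    Hypothetical stand-in for the paper's graph clustering step: an edge
    joins two hashtags only if they co-occur in at least min_count tweets.
    """
    graph = {}
    for tweet in tweets:
        for tag in extract_hashtags(tweet):
            graph.setdefault(tag, set())  # every hashtag is a node
    for (a, b), n in cooccurrence_counts(tweets).items():
        if n >= min_count:
            graph[a].add(b)
            graph[b].add(a)

    clusters, seen = [], set()
    for start in sorted(graph):
        if start in seen:
            continue
        # Depth-first traversal collects one connected component.
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            component.add(node)
            stack.extend(graph[node] - seen)
        clusters.append(component)
    return clusters


# Made-up sample data, for illustration only.
tweets = [
    "Great game tonight #nba #basketball",
    "Playoffs start soon #NBA #basketball",
    "New phone launch #tech #gadgets",
    "Reviewing the latest #tech #gadgets",
    "Watching the #nba finals on my new #tech",
]
clusters = cluster_hashtags(tweets, min_count=2)
```

With these sample tweets, the single tweet that mentions both `#nba` and `#tech` falls below the threshold, so the sports and technology hashtags stay in separate groups. A weighted clustering method and a textual similarity term, as the abstract describes, would replace the hard threshold in a fuller treatment.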

Similar resources

Hashtag Processing for Enhanced Clustering of Tweets

Rich data provided by tweets have been analyzed, clustered, and explored in a variety of studies. Typically those studies focus on named entity recognition, entity linking, and entity disambiguation or clustering. Tweets and hashtags are generally analyzed on sentential or word level but not on a compositional level of concatenated words. We propose an approach for a closer analysis of compound...

Bootstrapped Learning of Emotion Hashtags #hashtags4you

We present a bootstrapping algorithm to automatically learn hashtags that convey emotion. Using the bootstrapping framework, we learn lists of emotion hashtags from unlabeled tweets. Our approach starts with a small number of seed hashtags for each emotion, which we use to automatically label tweets as initial training data. We then train emotion classifiers and use them to identify and score c...

Spatial Hashtags in Tweets

Twitter is an online social networking service that enables users to communicate by sending and reading short texts of up to 140 characters, called "tweets". Users can attach any hashtag, starting with a "#", to tweets to indicate additional or summary information. Tweets with explicit or implicit spatial intent can be classified as spatial tweets. Spatial tweets are commonly seen in practice, an...

Evaluating the Effectiveness of Hashtags as Predictors of the Sentiment of Tweets

Twitter is a microblogging application that has garnered much interest in recent years. The main source of attraction is its user-generated content, called tweets, which users create daily. Tweets are 140-character text messages expressing opinions about different topical issues. They are highly informal and compact, with many different conversational features, some of which are specifi...

Automatic Hashtag Recommendation for Microblogs using Topic-Specific Translation Model

Microblogging services continue to grow in popularity, and users publish a massive number of instant messages through them every day. Many tweets are marked with hashtags, which usually represent groups or topics of tweets. Hashtags may provide valuable information for many applications, such as retrieval, opinion mining, and classification. However, since hashtags should be manually annotated, only ...

Publication date: 2011